Add mem_size histogram #618
Closed
The second of three planned PRs for LSDB's #449.
Overall plan:
A little more detail:
- `threshold_mode`. The threshold mode defaults to `row_count`, but is set to `mem_size` if `byte_pixel_threshold` is in the input args (and is not `None`).
- `MEM_SIZE_HISTOGRAM_BINARY_FILE` and `MEM_SIZE_HISTOGRAMS_DIR`.
- `gather_plan`, which creates histogram directory(/ies) among other setup stuff.
- `map_to_pixels`); and in doing so, we create the histogram. I am electing to make two histograms here, so long as the histogram mode is set to `mem_size`. Is it ok to just call `mr.map_to_pixels` a second time like that?
- `_get_mem_size_of_chunk` and its two helpers `_get_row_mem_size_data_frame` and `_get_row_mem_size_pa_table`
- `which_histogram` to `read_histogram` to show that we're reading the `row_count` histogram. It's the default, but I wanted to include it for readability/safety.
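
For reviewers, the threshold-mode rule and the two-histogram idea described above can be sketched roughly as follows. This is an illustrative, standard-library-only sketch and not the code in this PR: `resolve_threshold_mode` and `build_histograms` are hypothetical names, the arguments are modelled as a plain dict, and `len(payload)` stands in for whatever per-row byte size the real helpers compute.

```python
from collections import Counter


def resolve_threshold_mode(args: dict) -> str:
    """Hypothetical helper: pick the threshold mode from the input args.

    Defaults to "row_count"; switches to "mem_size" only when
    byte_pixel_threshold is present and not None.
    """
    if args.get("byte_pixel_threshold") is not None:
        return "mem_size"
    return "row_count"


def build_histograms(rows, threshold_mode):
    """Hypothetical helper: accumulate per-pixel histograms.

    rows is an iterable of (pixel, payload) pairs, where payload is a
    bytes object standing in for one row of data. The row-count histogram
    is always built; the mem-size histogram is built only in mem_size mode.
    """
    row_count = Counter()
    mem_size = Counter() if threshold_mode == "mem_size" else None
    for pixel, payload in rows:
        row_count[pixel] += 1
        if mem_size is not None:
            # Stand-in for the real per-row memory size measurement.
            mem_size[pixel] += len(payload)
    return row_count, mem_size


mode = resolve_threshold_mode({"byte_pixel_threshold": 1_000_000})
counts, sizes = build_histograms([(0, b"abc"), (0, b"de"), (3, b"x")], mode)
```

The sketch fills both histograms in a single pass, whereas the description above asks about calling `mr.map_to_pixels` a second time; it takes no position on which approach the PR should use.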